Add unsorted decompressed chunk path even if we have sorted ones #6879
The unsorted paths are better for hash aggregation, but currently in this case we are only going to add sorted paths.
Add ANALYZE. To keep the desired MergeAppend plans, we also have to add a LIMIT everywhere so that the MergeAppend is chosen based on its lower startup cost. Otherwise the plain Sort over Append will be chosen because for small tables its cost is less.
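The effect of the LIMIT on the plan choice can be sketched with a simplified cost model. This is only an illustration: it assumes the planner compares paths for a partial fetch by interpolating linearly between startup and total cost, and the cost numbers and the names `PathCost`, `fractional_cost` and `cheapest` are hypothetical, not taken from PostgreSQL or TimescaleDB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathCost:
    name: str
    startup: float  # cost paid before the first row can be returned
    total: float    # cost to return every row


def fractional_cost(path: PathCost, fraction: float) -> float:
    """Estimated cost to fetch `fraction` (0..1] of the rows,
    interpolated linearly between startup and total cost."""
    return path.startup + fraction * (path.total - path.startup)


def cheapest(paths, fraction: float = 1.0) -> PathCost:
    """Pick the path with the lowest estimated cost for the given fraction."""
    return min(paths, key=lambda p: fractional_cost(p, fraction))


# Hypothetical costs for a small table: the Sort must consume all input
# before emitting a row, while MergeAppend can start returning rows early.
sort_over_append = PathCost("Sort over Append", startup=60.0, total=62.0)
merge_append = PathCost("MergeAppend", startup=5.0, total=80.0)

# Without LIMIT the whole result is fetched, so total cost decides.
no_limit = cheapest([sort_over_append, merge_append])

# With a LIMIT that fetches about 1% of the rows, startup cost dominates.
with_limit = cheapest([sort_over_append, merge_append], fraction=0.01)

print(no_limit.name)
print(with_limit.name)
```

With these made-up numbers, the plain Sort over Append wins when all rows are fetched, and the MergeAppend wins once only a small fraction is needed, which is why the tests add a LIMIT.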
Add ANALYZE after compression. The plan changes are expected, SeqScans are preferred over IndexScans and Sort over MergeAppend for small tables.
We would add extra Sort nodes when adjusting the children of space partitioning MergeAppend under ChunkAppend. This is not needed because MergeAppend plans add the required Sort themselves, and in general no adjustment seems to be required for the MergeAppend children specifically there.
…ppend_partially_compressed-* ordered_append-*
…y_compressed-* ordered_append-*
…append_partially_compressed-* ordered_append-*
This reverts commit e94bd26.
Group Key: _hyper_31_114_chunk.device_id
-> Sort
Sort Key: _hyper_31_114_chunk.device_id
-> Gather
Workers Planned: 2
-> Parallel Append
This should be GatherMerge above Sort, will be addressed here: #7547
/*
 * Check if this path is parameterized on a compressed
 * column. Ideally those paths wouldn't be generated
 * in the first place but since we create compressed
 * EquivalenceMembers for all EquivalenceClasses these
 * Paths can happen and will fail at execution since
 * the left and right side of the expression are not
 * compatible. Therefore we skip any Path that is
 * parameterized on a compressed column here.
 */
I think I fixed this some time ago: we shouldn't be creating EquivalenceMembers on compressed columns of the compressed chunk table anymore, because they don't make sense anyway. I removed this check, and the tests for older issues still pass.
Did this have any effect on planning time with many compressed chunks?
@@ -200,8 +200,8 @@ generate_series(1,3) device;
Sort Method: top-N heapsort
-> Custom Scan (DecompressChunk) on _hyper_1_3_chunk (actual rows=30 loops=1)
Filter: (device = ANY ('{1,2,3}'::integer[]))
-> Index Scan using compress_hyper_2_6_chunk_device__ts_meta_min_1__ts_meta_max_idx on compress_hyper_2_6_chunk (actual rows=3 loops=1)
Index Cond: (device = ANY ('{1,2,3}'::integer[]))
-> Seq Scan on compress_hyper_2_6_chunk (actual rows=3 loops=1)
This seems like a regression? We have a constraint on the first index column, so the index should be beneficial.
For small tables, I think the Seq Scan is often chosen instead of the Index Scan. E.g. we often see this change after adding ANALYZE.
There's a 5% regression on a couple of queries in the planning suite. I'll see if I can optimize this somehow.
There were some changes in the "ordered_append_planning" suite, but it's actually an execution time change; I verified this manually and changed the queries to use [...]. The reason for the execution time change is that [...]. This happens for queries like [...]
The unsorted paths are better for hash aggregation, but currently if we're doing aggregation and we can push down the sort, we are only going to add sorted paths.
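As a rough illustration of why the missing unsorted path matters, the sketch below enumerates aggregation plans over a set of input paths. It is a toy model, not the planner's code: `Path`, `best_aggregate_plan`, and all cost constants are hypothetical, and it assumes only that a hash aggregate accepts any input order while a sorted (group) aggregate needs input ordered on the group key.

```python
from typing import NamedTuple, Optional


class Path(NamedTuple):
    name: str
    total_cost: float
    sort_key: Optional[str]  # None means the output is unsorted


# Hypothetical per-node aggregation costs.
HASH_AGG_COST = 10.0   # works on input in any order
GROUP_AGG_COST = 4.0   # requires input sorted on the group key


def best_aggregate_plan(paths, group_key):
    """Return (cost, description) of the cheapest aggregate over the paths."""
    candidates = []
    for p in paths:
        # Hash aggregation can sit on top of any path.
        candidates.append(
            (p.total_cost + HASH_AGG_COST, f"HashAggregate over {p.name}"))
        # Sorted aggregation is only possible on suitably ordered input.
        if p.sort_key == group_key:
            candidates.append(
                (p.total_cost + GROUP_AGG_COST, f"GroupAggregate over {p.name}"))
    return min(candidates)


# Hypothetical costs: the sorted path pays for a sort, the unsorted one does not.
sorted_path = Path("sorted DecompressChunk", 120.0, "device_id")
unsorted_path = Path("unsorted DecompressChunk", 40.0, None)

only_sorted = best_aggregate_plan([sorted_path], "device_id")
with_unsorted = best_aggregate_plan([sorted_path, unsorted_path], "device_id")

print(only_sorted)
print(with_unsorted)
```

In this model, when only the sorted path is offered, every aggregation plan carries the cost of the sort; once the unsorted path is also available, the hash aggregate over it becomes the cheapest option.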
Fixes #6836
Fixes #7084